I Can Guess What You Mean: A Monolingual Query Enhancement for Machine Translation

Authors

Chenxi Pang

Hai Zhao

Zhongyi Li

Abstract

We introduce a monolingual query method that uses additional webpage data to improve translation quality, motivated by the growing demand for statistical machine translation output in official use. The idea behind the method is that the readability of a sentence can be improved once and for all by replacing a translated sentence with the most closely related human-written sentence. Using vector space representations of the translated sentences, we query a search engine for additional reference text. We then rank the translated sentences and make the necessary replacements from the query results. We evaluate several sentence vector representations (TF-IDF, latent semantic indexing, and neural network word embeddings). The experimental results show that the method offers an alternative way to enhance current machine translation, with improvements of about 0.5 BLEU on a French-to-English task and 0.7 BLEU on an English-to-Chinese task.
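The ranking-and-replacement step described in the abstract can be illustrated with a minimal sketch, assuming only the TF-IDF representation and cosine similarity. This is not the authors' implementation: the function names, the whitespace tokenizer, the smoothed IDF formula, and the `threshold` parameter are all hypothetical, and the search engine query is replaced by a plain list of candidate sentences.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; the abstract does not specify one.
    return text.lower().split()


def tfidf_vectors(sentences):
    """Return one sparse TF-IDF vector (a dict) per sentence."""
    docs = [Counter(tokenize(s)) for s in sentences]
    n = len(docs)
    # Document frequency: each term is counted once per sentence that contains it.
    df = Counter(term for doc in docs for term in doc)
    # Smoothed IDF (an assumption) so shared terms keep a non-zero weight.
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1.0 for t in df}
    return [{t: tf * idf[t] for t, tf in doc.items()} for doc in docs]


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def best_replacement(translation, candidates, threshold=0.5):
    """Return (sentence, score): the most similar candidate if its score
    reaches the hypothetical threshold, otherwise the original translation."""
    if not candidates:
        return translation, 0.0
    vectors = tfidf_vectors([translation] + candidates)
    query, rest = vectors[0], vectors[1:]
    scores = [cosine(query, v) for v in rest]
    best = max(range(len(candidates)), key=scores.__getitem__)
    if scores[best] >= threshold:
        return candidates[best], scores[best]
    return translation, scores[best]


if __name__ == "__main__":
    # Hypothetical stand-ins for an MT output and retrieved webpage sentences.
    mt_output = "the cat sit on mat"
    retrieved = ["the cat sat on the mat", "stock prices fell sharply today"]
    print(best_replacement(mt_output, retrieved))
```

In this sketch a translation is kept unchanged when no candidate is similar enough, which is one plausible reading of "necessary replacement"; the abstract does not state the actual criterion.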

Similar resources

EXETER at CLEF 2002: Experiments with Machine Translation for Monolingual and Bilingual Retrieval

This year, the University of Exeter participated in both the CLEF 2002 monolingual and bilingual task for two languages: Italian and Spanish. We submitted 4 ranked results each for both Italian and Spanish Monolingual tasks and 5 each for the bilingual tasks. We report experimental results from our investigations of merging topic translations from two machine translation (MT) systems and recent...

Query Rewriting Using Monolingual Statistical Machine Translation

Long queries often suffer from low recall in Web search due to conjunctive term matching. The chances of matching words in relevant documents can be increased by rewriting query terms into new terms with similar statistical properties. We present a comparison of approaches that deploy user query logs to learn rewrites of query terms into terms from the document space. We show that the best resu...

MT-Based Query Translation CLIR Meets Frequent Case Generation

The paper introduces the evaluation results of Cross Language Information Retrieval (CLIR) for three target languages, Finnish, German and Swedish, using English as the source language. Our CLIR approach is based on machine translation of topics and usage of the Frequent Case Generation (FCG) method for management of query term variation in translated topics and retrieval in inflected indexes. Re...

University of Hagen at CLEF 2005: Towards a Better Baseline for NLP Methods in Domain-Specific Information Retrieval

The third participation of the University of Hagen at the German Indexing and Retrieval Test (GIRT) task of the Cross Language Evaluation Campaign (CLEF 2005) aims at providing a better baseline for experiments with natural language processing (NLP) methods in domain-specific information retrieval (IR). Our monolingual experiments with the German document collection are based on a setup combinin...

Cross-Lingual Information Retrieval System for Indian Languages

This paper describes our first participation in the Indian language sub-task of the main Adhoc monolingual and bilingual track in CLEF competition. In this track, the task is to retrieve relevant documents from an English corpus in response to a query expressed in different Indian languages including Hindi, Tamil, Telugu, Bengali and Marathi. Groups participating in this track are required to s...

Publication date: 2016
